Genome-wide identification of CBF genes and their responses to cold acclimation in Taraxacum kok-saghyz

C-repeat binding factors (CBFs) are transcription factors that are known to play important roles in plant cold acclimation. They are highly conserved in most higher plants. Taraxacum kok-saghyz (TKS) is an herb native to China and Kazakhstan and is well-known for its production of rubber silk with industrial and economic value. To understand cold acclimation mechanisms, we conducted a genome-wide discovery of the CBF family genes in TKS and revealed ten CBF genes. A bioinformatic analysis of the CBF genes was carried out to analyze the phylogenetic relationship, protein conservative motifs, protein physicochemical properties, gene structure, promoter cis-acting elements, and the gene expression patterns under cold acclimation and control conditions. It was found that most of these genes were highly responsive at the late stage of cold acclimation, indicating that they play important roles in the cold acclimation processes of TKS. This study provides a theoretical basis for the study of the molecular functions of the CBF gene family in TKS, and a useful guidance for the genetic improvement of the cold tolerance traits of TKS and other plants, including crops.


INTRODUCTION
Low temperature is one of the severe abiotic stresses which negatively affects plant growth and development. To respond to and tolerate the low temperature stress, plants have evolved complex and efficient molecular regulatory mechanisms (Zeng et al., 2021;Zhang et al., 2022). Many plants increase their freezing stress tolerance upon exposure to low but nonfreezing temperatures, a phenomenon known as cold acclimation (Thomashow, 1999). Cold acclimation is a self-protection response to low temperature, and has a positive impact on the improvement of low temperature adaptation of crops. It is a complex process involving many molecular, biochemical, and physiological changes (John et al., 2016). It has been reported that cold acclimated Sonneratia apetala seedlings have lower relative electrolyte leakage and malondialdehyde (MDA) content than non-acclimated seedlings (Shen et al., 2021). Furthermore, it has been shown that cold acclimated Hordeum vulgare (barley) plants have higher cold tolerance and photosynthesis rates than non-acclimated control plants (Jurczyk et al., 2018).
In recent years, the molecular mechanisms underlying cold acclimation have been widely studied, and efforts have been made to explore the key regulatory factors of this complex network (Liu et al., 2019). For example, an inducer of CBF expression 1 (ICE1)-C-repeat (CRT)-binding factors (CBF)-cold responsive (COR) signaling pathway in plant cold stress response has been recently reviewed (Hwarari et al., 2022). In Arabidopsis thaliana, cells can sense cold temperature through calcium/calmodulin-regulated receptor-like kinases (CRLKs) at the plasma membrane. Following cold sensing, an increase in cytosolic Ca 2+ triggers the expression of CBFs, which bind to C-repeat/dehydration-responsive motif (CRT/DRE, G/ACCGAC) in the downstream COR gene promoter region, inducing the expression of COR (Gilmour et al., 1998). AP2 domain in CBFs directly binds to CRT/DRE cis-element in cold-regulated COR promoter (Tang et al., 2020). In A. thaliana, some CBF target genes, such as COR15A and COR78, are particularly relevant to cold acclimation. For example, overexpression of COR15 significantly enhanced freezing resistance (Rocco et al., 2013). The CBFs are to some extent conserved in the plant domain. They belong to a dehydration responsive element binding protein (DREB) subfamily A-1 subgroup in the AP2/ERF transcription factor family (Mizoi, Shinozaki & Yamaguchi-Shinozaki, 2012). The CBFs are plant specific transcription factors; their protein structures contain a highly conserved AP2/EREBP DNA binding domain composed of about 60 amino acids (Lata & Prasad, 2011). Genome of A. thaliana contains six CBF genes AtCBF1, AtCBF2, AtCBF3, AtCBF4, AtCBF5, and AtCBF6. Each of them acts in a different pathway and is differentially expressed under stress conditions (Wang et al., 2014). For example, AtCBF4 is not induced by cold, but its overexpression in A. thaliana can enhance plant tolerance to frost and drought (Haake et al., 2002). Emerging studies in Arabidopsis have explored the transcriptional network of cold acclimation pathways. Compared with the wild type, the freezing tolerance of A. thaliana CBF1 and CBF3 knock-out mutants decreased by about 60% (Novillo, Medina & Salinas, 2007). CBF1/2/3 triple mutants had little freezingsensitive phenotype under normal growth conditions, but showed strong freezing-sensitive phenotype after cold acclimation (Park et al., 2015;Zhao et al., 2016). Through RNA-Seq analysis of CBF1/2/3 triple mutants, it was found that mutation of CBFs affects 10%∼20% of the COR gene expression in the whole transcriptome, indicating other factors important for cold tolerance may function downstream of CBFs (Jia et al., 2016). CBFs clearly play a key role in cold acclimation (Liu et al., 2019). Many studies have demonstrated that CBFs are involved in the transcriptional regulation in Liriodendron chinense (Guan et al., 2021), Triticum aestivum (Zan et al., 2020), Lactuca sativa (Park, Shi & Mou, 2020), Brassica napus (Ghorbani et al., 2020), Brassica oleracea (Thamilarasan et al., 2014, Betula (Lv et al., 2020), soybean (Yamasaki & Randall, 2016, and strawberry (Fattash et al., 2021). Silencing of the CBF genes significantly reduced the frost tolerance of Betula and Brachypodium distachyon (Hao et al., 2017).
Natural rubber (NR) is an important industrial material related to the national economy and people's livelihood. Rubber grass (Taraxacum kok-saghyz, TKS) is a perennial herb of the Compositae family commonly referred to as the Kazakh or Russian dandelion. It is native to Tianshan Mountains of China and Kazakhstan (Kirschner et al., 2013). It is well-known for its rubber production together with the rubber tree and the silver rubber Chrysanthemum. Dry roots of TKS contain rubber silk, which has industrial and economic value. TKS are characterized by strong environmental and cold temperature adaptability; thus, they thrive in cold areas with strong growth and reproductive ability. Therefore, TKS is a great plant for studying the functions of CBF genes in cold tolerance and in improving rubber production. TKS contains large amounts of cis-1,4-polyisoprene in its enlarged roots, which is an alternative plant source of NR (Xie et al., 2019). The amount of NR in the TKS roots is similar to that in the rubber tree Hevea brasiliensis (Kirschner et al., 2013). Therefore, TKS is a model plant for studying NR biosynthesis as it possess a relatively small genome and a fast growth-cycle (Schmidt et al., 2010). Its genome is relatively simple (1.29 Gb), containing 46,731 protein-coding genes (Lin et al., 2018). With the completion of TKS genome sequencing, study of key functional genes at the genome-wide level as well as cloning and expression analysis of resistance-associated genes has become possible. It is valuable to be able to guide the applied research of stress-related genes in TKS and other important crops. Cold stress is a major factor limiting the planting of TKS. Cold acclimation can advance the germination period of TKS in the spring and delay the withering period in autumn so as to improve rubber production by prolonging the growth period. Therefore, it is of great practical significance to study the mechanisms underlying the cold acclimation of TKS. At present, the research on TKS mainly focuses on rubber extraction, structural characterization, variety screening, and rubber synthesis-related genes. There is a lack of systematic research on gene families at the whole genome scale.
This study is aimed to understand the structure and function of the CBF family in TKS. With the continuous advancement of genomic research, gene family analysis at the whole genome level has become a reality (Wang et al., 2019a;Wang et al., 2019b). The acquisition of TKS genomic data provides a basis for the analysis of its gene family (Lin et al., 2018), which facilitates further understanding of gene functions and regulatory mechanisms. Here we studied the TKS CBF gene family, including gene structure, cis-acting elements, conserved motifs, phylogenetic context, and gene expression profiles during cold acclimation. These results have contributed to a deeper understanding of the CBF gene structure and function in plant cold stress response and cold acclimation.

Plant materials, cold acclimation treatment analysis
TKS seeds were provided by the College of Life Sciences, Heilongjiang University, China. They were sown in pots (inner diameter 14 cm, height 15 cm) with seedling vermiculite soil, and grown at room temperature 25 • C under a light intensity of 100 µmol m −2 sec −1 . Figure 1 Experimental design of cold acclimation of seedlings. Schematic diagram of Taraxacum koksaghyz seedlings subjected to different temperature treatments. The control seedlings were grown at 25 • C followed by cold treatment at 5 • C for 0, 3, 6, 12 and 24 h. Cold-acclimation was conducted at 15 • C and 10 • C, each for one day, followed by cold treatment at 5 • C for 0, 3, 6, 12 and 24 h. At four-pair-leaf growth stage, the seedlings were selected and divided into two groups: (1) control: non-acclimated seedlings grown at room temperature 25 • C, followed by at low temperature 5 • C as cold treatment, and (2) cold acclimation: grown at 15 • C for 1 day, followed by a second day at 10 • C prior to a cold treatment at 5 • C ( Fig. 1). Leaves were collected at 0, 3, 6, 12, and 24 h for each group of treatments. They were frozen immediately in liquid nitrogen for RNA extraction according to Shen et al. (2021). For each sample, three biological replicates and three technical replicates were conducted.

Identification of CBF family genes
A. thaliana CBF genome sequences were downloaded from TAIR (http://www.arabidopsis. org). The translated CBF protein sequences were used for local BLAST analysis in the TKS database (Lin et al., 2018), and the corresponding CBF homologous genes were screened with an e-value threshold of <10 −5 . In addition, the presence of the AP2domain was examined using the hmmscan function of HMMER3.0 (http://hmmer.org) (Potter et al., 2018) with the AP2 domain profile (Pfam accession, PF00847). Pfam scan (http://pfam.xfam.org/) verifies whether the candidate protein contains the AP2 domain. Multiple sequence alignment with Clustal Omega program was conducted to determine whether there are PKKPAGR and DSAWR sequences before and after the AP2 domain. The protein sequences with these two characteristics were assigned to the TKS CBF family. TBtools software was used to analyze conserved amino acid residues. Through the website analysis tool ProtParam (http://web.expasy.org/protparam/), ExPASy (https://web.expasy.org/protscale/) (Artimo et al., 2012) and PSORT (http://psort.HGC.JP/), the basic physical and chemical properties of proteins, such as molecular weight (Da), isoelectric point (pI), hydrophilicity, and subcellular localization were analyzed.  Kumar, Stecher & Tamura, 2016). A phylogenetic tree was constructed using a neighborjoining (NJ) method with the default parameters. GFF files with TkCBF chromosome location information were extracted from the TKS genome annotation files based on their starting and ending positions in the chromosomes. Using mapinspect software (https://mapinspect.software.informer.com/), the chromosome locations of TkCBFs were analyzed according to the genome annotation information. Gene intron-extron structure information was collected from the genome annotations of TKS from TKS databases. Then, the exon-intron structures for TKS CBF genes were checked and measured by analysis the sequences (Liu et al., 2013). The TkCBF gene structure was mapped using the TBtools software. The full-length protein sequences were analyzed using the MEME online program (https://meme-suite.org/meme/) (Bailey et al., 2015) to determine the distribution patterns of protein conserved motifs.

Promoter Cis-acting elements analysis of the TkCBF Genes
Based on the genome annotation information provided by the TKS database, a 2,000 bp sequence upstream of the transcription start site of the TkCBF genes was extracted and submitted to Plant CARE (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) for promoter prediction. GSDS (http://gsds.gao-lab.org/) (Hu et al., 2015) online software was used to draw a distribution map of cis-regulatory elements.

Expression analysis by qRT-PCR
Total RNA was extracted from TKS leaves without cold acclimation at 15 • C and 10 • C (25 • C, 5 • C) and cold acclimation at 15 • C and 10 • C ( Fig. 1) using the Trizol method. Total RNA (1 µg) was used for reverse transcription. The first strand cDNAs for qRT-PCR were synthesized using PrimeScript TM RT Master Mix (TaKaRa, Beijing, China). qRT-PCR was performed with PowerUp TM SYBR TM Green Master Mix (Thermo, Beijing, China), and 18S rRNA was used as the internal reference gene in this experiment. Premier5 was used to design the specific primers for the TkCBF genes (Table S1). Melting curves were acquired to determine primer specificity. Each PCR was performed using three biological duplicates and three technical replicates. Relative gene expression levels were calculated using the 2 − Ct and ANOVA.

Genome-wide identification of TkCBF genes
Sequences of CBF genes in the A. thaliana genome were retrieved from TAIR for local blast identification of TkCBF genes. A total of 24 TkCBF candidate genes were identified by blastp (E-value < 10 −5

Analysis of protein characteristics and chromosome location of TkCBFs
The phylogeny of a species can be reflected by its homology, and therefore can be used to help annotate newly sequenced genes based on known genes. To gain a better understanding of the similarity and dissimilarity of motifs in different TkCBF proteins, we performed multiple sequence alignment and motif distribution analysis. Sequence alignment analysis of the 10 TKS CBFs and six A. thaliana CBF proteins revealed that the AP2/ERF domain of TkCBF proteins has high homology with the characteristic motif of the CBF family. All the identified TkCBFs contained the previously reported CBF-specific conserved domain PKK/RPAGRxKFxETRHP-AP2/ERFdomain-DSAWR motif (Canella et al., 2010) (Fig. 2). Further analysis of the distribution of the TkCBF conserved domains showed 10 conserved motifs predicted by MEME. The 10 motifs found were named Motif 1-Motif 10 (Fig. 3A). Each TkCBF protein has its specific conserved motifs. Since the conserved motifs are highly similar, proteins with the same motifs may be functionally similar. Motif analysis found that all CBFs contained motifs 1, 2, and 3 (Fig. 3A). Combined with the results of multiple sequence alignment (Fig. 2), motif 2 was a typical AP2 domain, and was highly conserved among the TkCBF family proteins. Please note that there are no non-coding sequences, which is consistent with the CBF study of the tea tree (Hu et al., 2020). The conserved motifs may play special functions. The CBFs in TKS with close evolutionary relationships usually have similar structure and number of conserved motifs, implying they may have similar functions.
The chromosomal map constructed revealed that the 10 TkCBF genes were distributed on six independent genome assembly scaffold fragments (Fig. 4). The scaffold utg19621 contained the largest number of TkCBFs (three genes), and two genes on both scaffold utg4653 and utg16021. Other scaffolds each contain one gene, and the distribution of genes in the different scaffolds is inconsistent (Fig. 4) due in large part to differences in the structure and size of the chromosomes.

Phylogenetic and gene structure analyses of the TKS CBF gene family
To study the phylogenetic relationship of the CBFs in TKS and other plants, the members of CBF families from TKS, A. thaliana, O. sativa, Z. mays, and L. sativa were collected to construct the phylogenetic tree. Forty-six CBF members were divided into four groups  5). Group I and Group II contained the CBF genes of rice and maize, but does not include the A. thaliana, L. sativa, and TKS CBF genes. It is speculated that there is a species differentiation in their origin. Group III and Group IV contained L. sativa DREB genes and TKS CBF genes, indicating that these genes co-exist in the same family and Asteraceae. This result implied the difference in evolution between Groups I-II and Groups III-IV. The result clearly showed close relationships between TKS and L. sativa CBF genes, since the genes from these two species were closely clustered in different groups and subgroups of the phylogenetic tree. The 10 TkCBFs in Groups III and IV were homologous to AtCBFs, LsDREBs, and TkCBFs were closely related to A. thaliana and L. sativa, indicating the expression and functional diversity of CBF family members. An important factor that determines the level of gene expression and function is the structure of the gene. The gene structure analysis of TkCBFs revealed that all members of the family contain AP2 structural domains, which are strongly conserved. The gene structure of TkCBFs is consistent with the evolutionary results of its members (Fig. 3B). Among the 10 TkCBF genes in TKS, only TkCBF10 contained one intron. The intron phase was found to be a type 1 intron by sequence analysis (Fig. 3B), and the other 9 TkCBFs were all single-exon structures, presumably all members of this family were intron deletion genes except TkCBF10. The loss of introns may shorten the time required for gene transcription to translation, therefore enabling fast gene expression, and production of functional proteins in response to changes in the plant or the environment (Shu et al., 2015).

Cis-acting elements analysis of the promoter region of TkCBFs
Promoter analysis can help further explore the regulatory mechanisms of TkCBF family members. As shown in Fig. 6, cis-acting elements were predicted in the promoters of the 10 TkCBF genes. Ten types of cis-acting elements relevant to stresses (light, drought, and low-temperature) and phytohormone responses (abscisic acid, auxin, salicylic acid, and methyljasmonate) were discovered in the promoters of the TkCBF genes. Interestingly, the light-responsive elements accounted for the largest portion of all the identified ciselements (Table S3). Among the numerous cis-acting elements, we also identified several MYC motifs, which are binding sites for ICE1 (Table S3). ICE1 is mainly involved in the regulation of the expression levels of CBFs under cold stress, further supporting that TkCBFs may be involved in the ICE1-CBF-COR cold signaling pathway. The results suggest that the expression of TkCBF genes might be associated with different environmental factors, and could potentially play important roles in plant growth and response to harsh environmental conditions. The cis-element data can springboard further hypothesis-testing studies on the expression characteristics and functions of the ten TkCBF genes.

Expression analysis of TkCBF genes under cold acclimation
Here we investigate the expression profiles of the 10 TkCBF genes in the leaves of TKS under cold acclimation and control conditions (Fig. 7). Although the 10 TkCBF genes showed similar expression patterns in response to cold acclimation, they exhibited differential timings and altitudes. For example, the expression of TkCBF3, TkCBF5, and TkCBF6 showed an increase followed by a decrease in the trend and reached the highest expression level within 24 h. TkCBF1, TkCBF2, TkCBF4, TkCBF7, and TkCBF8 showed a similar upward and then downward trend, however, peak expression was reached within 6 h or 9 h, implying that the TkCBF1, TkCBF2, TkCBF4, TkCBF7, and TkCBF8 may be more sensitive to cold signal. Similarly, we also carried out real-time PCR of TKS leaves under non-acclimation/control condition. The expression of these 10 genes was also up-regulated, but the peak time was significantly later than those of cold-acclimated samples (Fig. 7). This finding is consistent with the results from A. thaliana (Gilmour Fowler & Thomashow, 2004;Lin et al., 2008).

DISCUSSION
CBF is a key transcription factor involved in plant cold acclimation (Shi, Ding & Yang, 2018), When plants are subjected to low temperature stress, CBF transcription factors are activated to regulate about 12% of the low temperature responsive transcriptome (Sun et al., 2014). CBF genes were discovered and explored in the reference plant A. thaliana (Thomashow, 2010). Ever since, CBF genes and their homologs have been increasingly studied to improve cold tolerance in other plants (Gehan et al., 2015;Francia et al., 2016). However, CBFs have been rarely reported in TKS, and their biological functions remain unclear. In this study, we conducted a genome-wide search for the CBF orthologs in TKS, and identified a total of 10 CBF genes. Compared with the reference species A. thaliana, TKS has more CBFs. This may be due to the slightly larger TKS genome, and its adaptation to adverse environmental conditions. The physicochemical properties of the CBF proteins revealed that the theoretical isoelectric points of TkCBF proteins are in the similar range as those from Punica granatum and Liriodendron chinense (Guan et al., 2021). The predicted subcellular localization indicate that the TkCBFs are localized in the nucleus for biological roles as transcription factors. The results of the phylogenetic tree indicate that the TkCBFs are closely related to those in lettuce, which makes sense due to the fact that both species belong to the Asteraceae family.
Analyses of gene structure and conserved motif differences can help determine the evolutionary relationships of gene families (Zhou et al., 2020). The composition of introns and exons in the genes is important for their evolutionary analysis. The TKS genome contains only one exon and no intron except for the TkCBF10 gene, which is consistent with the A. thaliana AtCBF1/2/3 gene structure (Medina et al., 1999), and has a closer phylogenetic relationship. The gene structure analysis inferred that the CBF genes containing introns might have mutated during evolution. A total of 10 TkCBF Figure 7 Expression profiles of TkCBF genes in response to cold acclimated and non-acclimated (control) conditions. qRT-PCR expression levels of the 10 CBF genes were determined using the 2 − Ct method with 18S RNA as the internal standard for normalization. The results are average with standard deviation of three biological replicates.
Full-size DOI: 10.7717/peerj.13429/ fig-7 family members were identified in this study, four more than the model species A. thaliana. This could be attributed to the difference in genome size and the degree of CBF gene amplification. CBFs within the same species have similar conserved motifs and gene structures. All the 10 TkCBFs genes encode proteins containing AP2 structural domains that are highly conserved, which is consistent with the results of CBF studies in Camellia sinensis (Wang et al., 2019a;Wang et al., 2019b). Intron-deficient genes can rapidly complete transcriptional processes under abiotic stress, enabling efficient plant response to stress signals (Giri et al., 2013). The TkCBFs are mostly intron-deficient genes, and it is hypothesized that they are more responsive to abiotic stresses. Most of the CBF gene family members in TKS consist of similar motifs, and the TkCBF proteins all have motifs 1, 2, and 3, suggesting that the functions of the CBF family proteins may be mediated by these three motifs. The similarity of most CBF proteins in gene structure and motif composition is consistent with the phylogenetic result of the CBF gene family. Compared with the A. thaliana cold-responsive gene AtCBF1-3 (Fig. S1), which has extremely similar conserved motifs, we predict that all 10 CBF genes of TKS are likely to be cold responsive genes. These results suggest that the CBF gene family may have similar functions in the cold acclimation. The promoter contains cis-acting elements located upstream of the transcriptional start, and they are at the center of gene transcriptional control. Analysis of promoter cis-acting elements showed that the promoters of all the 10 TkCBF genes have light-responsive signaling elements (Fig. 6). In A. thaliana, inhibition of phytochrome interaction factor (PIF) by shortening daylight hours resulted in up-regulation of CBF gene expression and improved freezing tolerance (Lee & Thomashow, 2012). Therefore, it is reasonable to hypothesize that TkCBF genes are involved in light signaling to regulate the growth and development processes. In addition, the expression of CBFs was affected by gibberellin (GA), jasmonic acid (JA), abscisic acid (ABA), and ethylene (ET). In addition, the present study screened the promoters of the 10 TkCBF genes for cis-acting elements responding to hormones such as ABA, GA, JA, auxin, and ET. This result will help us further elucidate the regulatory mechanisms of hormone signaling and cold acclimation in TKS.
Some reports showed that a triple mutant of AtCBF1, AtCBF2, and AtCBF3 in A. thaliana was extremely sensitive to freezing stress compared with the wild type control (Zhao et al., 2016;Zhao et al., 2016). AtCBF1, AtCBF2, and AtCBF3 were induced at 15 min under cold stress treatment, and their expression peaked at 2 h and then gradually decreased. (Gilmour et al., 1998). The expression patterns of CBFs in tomato and cotton were basically the same as that of A. thaliana. They all reached peak expression at 2 h. (Zhu et al., 2016). However, under cold treatment, the expression of most TkCBFs increased with the increasing treatment time, and showed a trend of increased expression, peaked at 6 h and then declined, a pattern similar to that of tea tree (Hu et al., 2020), Prunus mume (Zhao et al., 2018), and Brassica rapa L. (Jiang et al., 2011). In contrast to non-overwintering plants such as A. thaliana, tomato and cotton, TKS and tea tree usually experience a longer period of low-temperature chilling during the growing season. Therefore, the induction of TkCBFs during cold treatment takes longer time. It is also interesting to note that the expression pattern of TkCBF2 is most similar to that of the cold responsive AtCBFs, consistent with their sequence similarity. Future experiments should focus on investigating the cold-responsive molecular functions of these TkCBFs (e.g., their downstream target genes and pathways).

CONCLUSIONS
CBF proteins play a prominent positive role in the process of cold acclimation, and they are highly conserved in higher plants. In our study, we identified 10 CBF members in TKS. A comprehensive phylogenetic analysis of CBF family members in A. thaliana, maize, rice, and lettuce showed evolutionary relationship. In addition, protein conserved domains, protein physical and chemical properties, gene structure, promoter cis-acting elements, and gene expression patterns under cold acclimation and non-acclimation conditions were studied. Most CBF genes in TKS seedlings after cold acclimation took longer to respond to cold signal than those without the acclimation. This research represents a theoretical basis for a comprehensive understanding of TkCBF gene family. The results will springboard further studies on the molecular functions of the TKS CBF gene family, and on future genetic improvement of cold tolerance of TKS and crops.